Learning to Separate Voices by Spatial Regions

Xu, Zhongweiyang, Choudhury, Romit Roy

arXiv.org Artificial Intelligence

We consider the problem of audio voice separation for binaural applications, such as earphones and hearing aids. While today's neural networks perform remarkably well (separating $4+$ sources with 2 microphones), they assume a known or fixed maximum number of sources, K. Moreover, today's models are trained in a supervised manner, using training data synthesized from generic sources, environments, and human head shapes. This paper intends to relax both these constraints at the expense of a slight alteration in the problem definition. We observe that, when a received mixture contains too many sources, it is still helpful to separate them by region, i.e., isolating signal mixtures from each conical sector around the user's head. This requires learning the fine-grained spatial properties of each region, including the signal distortions imposed by a person's head. We propose a two-stage self-supervised framework in which overheard voices from earphones are pre-processed to extract relatively clean personalized signals, which are then used to train a region-wise separation model. Results show promising performance, underscoring the importance of personalization over a generic supervised approach. (Audio samples are available at our project website: https://uiuc-earable-computing.github.io/binaural/.) We believe this result could help real-world applications in selective hearing, noise cancellation, and audio augmented reality.
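To make the idea of separating by region rather than by source concrete, here is a minimal sketch that only groups sources into sectors by direction. It is not the paper's method: the equal-width azimuth sectors, the function names `region_index` and `group_by_region`, and the example angles are all assumptions for illustration, and the learned separation model itself is not shown.

```python
def region_index(azimuth_deg, num_regions):
    """Map an azimuth angle (degrees) to a sector index in [0, num_regions).

    Assumption: sectors are equal-width slices of the full circle,
    with sector 0 starting at 0 degrees.
    """
    width = 360.0 / num_regions
    return int((azimuth_deg % 360.0) // width)


def group_by_region(source_azimuths, num_regions):
    """Group source names by the sector their azimuth falls into.

    source_azimuths: dict mapping a source name to its azimuth in degrees.
    Returns a dict mapping sector index to a list of source names.
    """
    regions = {}
    for name, azimuth in source_azimuths.items():
        regions.setdefault(region_index(azimuth, num_regions), []).append(name)
    return regions


# Hypothetical example: three talkers, four sectors of 90 degrees each.
talkers = {"a": 10, "b": 100, "c": 20}
print(group_by_region(talkers, 4))
```

With more sources than a fixed-K separator can handle, the output per sector is a mixture of everything in that sector, which matches the relaxed problem definition described in the abstract.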


Can Computers Learn Like Humans?

#artificialintelligence

The world of artificial intelligence has exploded in recent years. Computers armed with AI do everything from driving cars to picking movies you'll probably like. Some have warned we're putting too much trust in computers that appear to do wondrous things. But what exactly do people mean when they talk about artificial intelligence? It's hard to find a universally accepted definition of artificial intelligence.


Whose Line Is It Anyway? Creating AI That Accurately Separates Voices on Sales Calls

#artificialintelligence

To attack this problem, our research team developed a patent-pending framework that uses deep learning to automatically generate a "voice fingerprint" for each sales rep from a combination of vocal characteristics. During the sales call itself, we cluster the audio signals based on those characteristics, with each cluster representing a speaker. The stored voice fingerprints play a crucial role not only in associating each speaker with the right cluster but also in the clustering process itself: the models we trained with the fingerprints allow us to learn and apply mathematical transformations to the audio that make the differences between speakers more distinct.
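The step of associating each cluster with a stored fingerprint can be sketched as follows. This is only an illustration under stated assumptions, not the patent-pending framework: it assumes each audio segment and each fingerprint is already a fixed-length embedding vector, that clustering has already been done, and that cosine similarity to the cluster centroid is the matching rule. The names `cosine`, `centroid`, `label_clusters`, `rep_a`, and `rep_b` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def label_clusters(clusters, fingerprints):
    """Assign each cluster the name of the most similar stored fingerprint.

    clusters: dict mapping cluster id to a list of segment embeddings.
    fingerprints: dict mapping speaker name to a fingerprint embedding.
    """
    labels = {}
    for cluster_id, vectors in clusters.items():
        center = centroid(vectors)
        labels[cluster_id] = max(
            fingerprints, key=lambda name: cosine(center, fingerprints[name])
        )
    return labels


# Hypothetical two-dimensional embeddings, purely for illustration.
clusters = {0: [[1.0, 0.1], [0.9, 0.0]], 1: [[0.1, 1.0], [0.0, 0.8]]}
fingerprints = {"rep_a": [1.0, 0.0], "rep_b": [0.0, 1.0]}
print(label_clusters(clusters, fingerprints))
```

Note that this simple rule can assign the same fingerprint to two clusters; a real system would likely need a one-to-one assignment and a way to handle speakers with no stored fingerprint, such as the customer on the call.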


Google Develops AI That Can Separate Voices in a Crowd

#artificialintelligence

Google Research engineers have developed a deep learning system that can separate voices from audio-visual data recorded in crowded environments. The system they developed is the equivalent of the "cocktail party" effect, a feature of the human brain that can isolate and focus on one or more particular voices in a crowd. The system is designed to work with both audio and video data at the same time. Google says it created its novel tech by feeding it over 100,000 high-quality videos of lectures and talks hosted on YouTube. All talks were given by a single speaker, with minimal background noise. They trained the AI to recognize sounds based on lip/mouth movement.


Can Computers Learn Like Humans?

NPR Technology

The world of artificial intelligence has exploded in recent years. Computers armed with AI do everything from driving cars to picking movies you'll probably like. Some have warned we're putting too much trust in computers that appear to do wondrous things. But what exactly do people mean when they talk about artificial intelligence? It's hard to find a universally accepted definition of artificial intelligence.